Abstract



Until now, design of the annual influenza vaccine has relied on phylogenetic or whole- 
sequence comparisons of the viral coat proteins hemagglutinin and neuraminidase, 
with vaccine effectiveness assumed to correlate monotonically with the
vaccine-influenza sequence difference. We use a theory from statistical
mechanics to quantify the non-monotonic immune response that results from
antigenic drift in the epitopes of the
hemagglutinin and neuraminidase proteins. The results explain the ineffectiveness 
of the 2003-2004 influenza vaccine in the United States and provide an accurate 
measure by which to optimize the effectiveness of future annual influenza vaccines. 
1 Introduction



Antigenic variation constitutes one mechanism employed by influenza viruses 
to evade the adaptive response of the host immune system. This antigenic drift 
of the recognized epitope regions of the viral surface proteins hemagglutinin
(HA) and neuraminidase (NA) constitutes a major challenge to effective vac- 
cine design, where historical experience and phylogenetic analysis of HA and 
NA protein sequences from circulating human strains are used to decide the 
components of the annual influenza vaccine. Here we introduce a theory to 
guide this important public health decision. Application of the theory could 
help prevent critical situations such as the one that occurred during the
2003-2004 influenza epidemic [1], when the administered A/Panama/2007/99 H3N2
vaccine gave unexpectedly [2] low protection against the mutant strain
A/Fujian/411/2002.
A model from statistical mechanics is used to evaluate the non-linear decrease 
of the immune response due to mutations in the viral epitope region sequences. 
We propose that this epitope analysis be regularly used as a measure of the 
immunological distance between mutant strains in the annual design of the 
influenza vaccine. 

Influenza A virus infections and their subsequent complications, such as
pneumonia, are a major cause of human morbidity and mortality. The 2003-2004
influenza epidemic was mainly due to the proliferation of the new H3N2 subtype
strain A/Fujian/411/2002, an antigenic drift mutant of A/Panama/2007/99.
According to the February 2003 WHO report [2], after comparing the whole
hemagglutinin (HA) sequences of both strains, the CDC council members concluded
that both proteins were similar enough to expect a significant degree of cross 
protection, and decided to include the Panama strain in the H3N2 component
of the vaccine. No information concerning the neuraminidase sequence for the
Fujian strain was used. Recent clinical results, as stated in the 16 January
2004 CDC Morbidity and Mortality Report [1], show the vaccine provided
essentially no protection against infection during the 2003-2004 season. 



2 Methods 



We have developed a theory of the immune response to an antigenic drift
strain after vaccination based on statistical mechanics (Figure 1) [3]. The
model predicts the affinity constant values

$$K_{\mathrm{eq}} = \frac{[\text{Antigen} : \text{Antibody}]}{[\text{Antigen}]\,[\text{Antibody}]} \qquad (1)$$

for a second antigen, after exposure to an original antigen whose epitope region
differs by probability $p_{\text{epitope}}$. The key measure of antigenic drift
in the theory is $p_{\text{epitope}}$, the fractional change between the
dominant epitope regions of the vaccine and the circulating strain, defined by
the equation

$$p_{\text{epitope}} = \frac{\text{number of mutations within the epitope}}{\text{number of amino acids within the epitope}} \qquad (2)$$

This characterization of antigenic drift in our theoretical model emphasizes the
experimental fact that only the epitope regions are significantly involved in
immune recognition, as shown by immunoassays and crystallographic images [4].
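As an illustration of the definition above, the fractional change can be computed directly from two aligned protein sequences and a list of epitope residue numbers. The following is a minimal sketch only: the function name `p_epitope`, the toy sequences, and the toy epitope are hypothetical and are not data from this work, and the sketch assumes both sequences share the same residue numbering.

```python
def p_epitope(vaccine_seq, circulating_seq, epitope_positions):
    """Fraction of epitope residues that differ between two aligned sequences.

    epitope_positions holds 1-based residue numbers. Both sequences are
    assumed to be aligned and numbered identically (an assumption of this
    sketch, not a statement about any particular database).
    """
    if len(vaccine_seq) != len(circulating_seq):
        raise ValueError("sequences must be aligned to the same length")
    positions = sorted(set(epitope_positions))
    if not positions:
        raise ValueError("epitope must contain at least one residue")
    # Count epitope positions at which the two strains carry different residues.
    mutations = sum(
        1 for pos in positions
        if vaccine_seq[pos - 1] != circulating_seq[pos - 1]
    )
    return mutations / len(positions)


# Toy 10-residue sequences (hypothetical, not real HA or NA data).
vaccine = "MKTIIALSYI"
circulating = "MKTIVALSHI"
toy_epitope = [3, 5, 8, 9]  # hypothetical epitope residue numbers

print(p_epitope(vaccine, circulating, toy_epitope))
```

Mutations outside the listed epitope positions do not enter the result, which is the point of using the epitope rather than the whole sequence.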

In the theory, it is the fraction of the epitope that changes that characterizes
antigenic drift. To provide additional empirical support for the theory, we
performed a historical analysis of the influenza seasons between 1991 and 2000
when the H3N2 virus subtype was dominant [5,6,7,8,9,10,11,12,13]. For every
season, hemagglutinin sequences [14] were compared between the vaccine strain
and the predominant circulating strain. A quantitative scale was defined to
measure the seasonal flu severity as follows: low (1), mild (2) and high (3).
The values of seasonal flu severity were correlated with the calculated
$p_{\text{epitope}}$ values. In addition, to make clear that the epitope region
is primarily responsible for immune recognition, we also correlated the seasonal
flu severity with antigenic drift of the entire hemagglutinin sequence,
normalized by the total number of amino acids in the protein,
$p_{\text{sequence}}$. The results presented in Figure 2 show that seasonal flu
severity is correlated more strongly with $p_{\text{epitope}}$ than with
$p_{\text{sequence}}$, thus favoring the epitope analysis approach. Therefore,
it is both logical and consistent with the observed data to characterize
antigenic drift by the number of mutations within the epitope regions, as we do
in the present work.

According to historic clinical experience, and to our model, when the antigenic
drift between the vaccine and circulating strain, characterized by the
$p_{\text{epitope}}$ value, is small, exposure to the vaccine antigen leads to a
higher affinity constant than without exposure. This result is why immune system
memory and vaccination are generally effective. When the antigenic drift between
the vaccine and circulating strain is large, the vaccine antigen is uncorrelated
with the circulating strain antigen, and so immune system memory does not play
a role. When the antigenic drift between the vaccine and circulating strain
epitopes is modest ($0.23 < p_{\text{epitope}} < 0.6$), our theory predicts that
the memory response may be worse than the naive response (the solid curve lies
below the dashed curve in Figure 1), which means that the immunological memory
from the vaccine exposure actually gives worse protection, i.e., a lower
affinity constant, than would no vaccination whatsoever. This result is the
original antigenic sin phenomenon for influenza: vaccination creates memory
sequences that for some mutation rates of influenza may increase susceptibility
to future exposure [15,16]. Parenthetically, not every infectious disease
exhibits original antigenic sin, measles being one example. The measles virus
does not undergo antigenic drift, and although approximately eight different
subtypes have been identified to date [17], the HA and NA genetic variation
among them does not exceed 7% on a nucleotide basis [17], or roughly 2% on an
amino acid basis. Accordingly, within the context of our model, there is no
possibility of original antigenic sin for measles (since
$p_{\text{epitope}} < 0.02$, see Figure 1).
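The three regimes described above can be summarized as a simple lookup on the epitope drift. This is a rough sketch rather than the statistical-mechanics model itself: the function name `memory_regime` and the regime labels are hypothetical, and treating everything at or below 0.23 as "small" drift and everything at or above 0.6 as "large" drift is an assumption, since the text gives explicit bounds only for the intermediate window.

```python
SIN_LOWER = 0.23  # lower edge of the original antigenic sin window quoted in the text
SIN_UPPER = 0.60  # upper edge of the window quoted in the text


def memory_regime(p_epitope):
    """Label the qualitative memory response expected for a given epitope drift.

    Only the window SIN_LOWER < p < SIN_UPPER comes from the text; assigning
    the values outside it to the "small" and "large" drift regimes is an
    assumption of this sketch.
    """
    if not 0.0 <= p_epitope <= 1.0:
        raise ValueError("p_epitope must lie between 0 and 1")
    if p_epitope <= SIN_LOWER:
        return "memory helps"            # vaccine raises the affinity constant
    if p_epitope < SIN_UPPER:
        return "original antigenic sin"  # memory response may fall below naive
    return "memory uncorrelated"         # vaccine antigen no longer informative


for p in (0.02, 0.33, 0.80):
    print(p, memory_regime(p))
```

The labels are only a reading aid for Figure 1; the quantitative response still has to be taken from the curve itself.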

Original antigenic sin stems from localization of the immune system response 
in antibody sequence space. This localization is a result of the roughness in 
sequence space of the evolved antibody affinity constant for antigen. Interest- 
ingly, there appears to have been a modest degree of original antigenic sin, 
termed negative vaccine effectiveness in the CDC Morbidity and Mortality
report [1], associated with the 2003-2004 influenza vaccine.

Human influenza A viruses are classified into different subtypes according to
the neuraminidase and hemagglutinin proteins. The current influenza A vaccine
includes both the H1N1 and H3N2 subtypes, and the consensus sequence for
the HA and NA proteins corresponding to each subtype requires annual update
due to continuous antigenic drift. Variations due to point mutations in
the residues in the epitope regions can considerably reduce the immune
response, despite biochemical cross activity between strains related by
antigenic drift. The epitope regions of the HA and NA proteins are shown in
Figure 3. We propose that antigenic drift mutants be compared not by the whole
sequences of the HA and NA proteins, but more precisely by the sequences
of the dominant epitopes for the proposed vaccine strain, by calculating the
$p_{\text{epitope}}$ parameter of our theory. According to the definition (2), a
different value of $p_{\text{epitope}}$ is obtained for each epitope region in
both the hemagglutinin and neuraminidase viral proteins. We propose to include
in the analysis only the $p_{\text{epitope}}$ values corresponding to the
dominant epitopes in both proteins. Since both hemagglutinin and neuraminidase
participate in the immune recognition process, it is some combination of the
immune recognition of these two proteins that contributes to reducing the
seasonal flu severity. We thus define an approximate total response as the
binding constant at the average of the $p_{\text{epitope}}$ values for the
dominant HA and NA epitopes:
$p_{\text{avg}} = \frac{1}{2}\left(p_{\text{epitope}}^{\text{HA}} + p_{\text{epitope}}^{\text{NA}}\right)$.

3 Results 

We compared the epitope sequences of the HA protein to look for mutations 
in the A/Fujian/411/2002 strain [14] with respect to the A/Panama/2007/99 
strain [18]. The hemagglutinin H3 protein has five epitope regions (A, B, C, 
D, E) that have been identified and sequenced [19], among which A and B are 
usually dominant [20,21]. There exists experimental and clinical evidence that 
epitope regions mutate much faster than other regions in the viral proteins, 
presumably due to antibody selective pressure [22,23,24], with the dominant 
epitopes mutating most rapidly [25]. Therefore, in the absence of more detailed
information, we take an observed high mutation rate (i.e., a high
$p_{\text{epitope}}$ value) in a given epitope to correlate with dominance.
Epitope A (residues 122, 124, 126, 130-133, 135, 137, 138, 140, 142-146, 150,
152, 168) presents one point mutation at residue 131. The calculated value is
$p_{\text{epitope}} = 1/19 = 0.053$. Epitope B (residues 128, 129, 155-160, 163,
165, 186-190, 192-194, 196-198) presents three point mutations at residues 155,
156, and 186. The calculated value is $p_{\text{epitope}} = 3/21 = 0.14$.
Epitope C (residues 44-48, 50, 51, 53, 54, 273, 275, 276, 278-280, 294, 297,
299, 300, 304, 305, 307-312) presents one point mutation at residue 50. The
calculated value is $p_{\text{epitope}} = 1/27 = 0.037$. Epitope D (residues
96, 102, 103, 117, 121, 167, 170-177, 179, 182, 201, 203, 207-209, 212-219,
226-230, 238, 240, 242, 244, 246-248) presents no mutations. Epitope E (residues
57, 59, 62, 63, 67, 75, 78, 80-83, 86-88, 91, 92, 94, 109, 260-262, 265)
presents two point mutations at residues 75 and 83. The calculated value is
$p_{\text{epitope}} = 2/22 = 0.09$. In the absence of further information, which
is the typical case for the annual task of influenza vaccine design, we conclude
that epitope B is likely dominant and epitope E is likely subdominant for the
A/Panama/2007/99 HA protein, whereas the other epitopes are cryptic. The
dominance of epitope B is in accordance with observed data [26]. Note that by
looking at the antigenic drift within the dominant epitope, rather than the
drift of the whole protein sequence, we obtain a larger and much more accurate
estimate of the degree to which the immune response to A/Fujian/411/2002 and
A/Panama/2007/99 will differ.
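The hemagglutinin values quoted above can be re-derived from the residue lists and mutated positions given in the text. The following is a sketch under stated assumptions: the helper names `parse_residues` and `drift_table` and the dictionary layout are hypothetical, while the residue lists and mutated positions are copied from the paragraph above.

```python
def parse_residues(spec):
    """Expand a list such as '122, 124, 130-133' into a sorted list of ints."""
    residues = set()
    for token in spec.split(","):
        token = token.strip()
        if "-" in token:
            start, end = (int(x) for x in token.split("-"))
            residues.update(range(start, end + 1))  # ranges are inclusive
        else:
            residues.add(int(token))
    return sorted(residues)


# Epitope residue lists and mutated positions as quoted in the text
# (A/Panama/2007/99 versus A/Fujian/411/2002, hemagglutinin H3).
HA_EPITOPES = {
    "A": ("122, 124, 126, 130-133, 135, 137, 138, 140, 142-146, 150, 152, 168",
          [131]),
    "B": ("128, 129, 155-160, 163, 165, 186-190, 192-194, 196-198",
          [155, 156, 186]),
    "C": ("44-48, 50, 51, 53, 54, 273, 275, 276, 278-280, 294, 297, 299, 300, "
          "304, 305, 307-312", [50]),
    "D": ("96, 102, 103, 117, 121, 167, 170-177, 179, 182, 201, 203, 207-209, "
          "212-219, 226-230, 238, 240, 242, 244, 246-248", []),
    "E": ("57, 59, 62, 63, 67, 75, 78, 80-83, 86-88, 91, 92, 94, 109, 260-262, "
          "265", [75, 83]),
}


def drift_table(epitopes):
    """Map epitope name -> (mutations, epitope size, p_epitope)."""
    table = {}
    for name, (spec, mutated) in epitopes.items():
        size = len(parse_residues(spec))
        table[name] = (len(mutated), size, len(mutated) / size)
    return table


ha = drift_table(HA_EPITOPES)
# Rank epitopes by drift, largest first, as the dominance heuristic suggests.
for name in sorted(ha, key=lambda k: ha[k][2], reverse=True):
    n_mut, size, p = ha[name]
    print(f"HA epitope {name}: {n_mut}/{size} = {p:.3f}")
```

Sorting by drift places epitope B first and epitope E second, matching the dominant and subdominant assignments made in the text.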

We also calculated the values for antigenic drift in the NA epitopes be- 
tween the A/Fujian/411/2002 [27] and A/Panama/2007/99 [28] strains. The 
neuraminidase N2 protein has been completely sequenced and crystallized 
[29,30]. Mutational studies with monoclonal antibodies identified three re- 
gions (A,B,C) in NA N2 that are important for recognition [4,31], of which 
only the surface residues can be within the epitopes. Regions A and B are
usually dominant [31]. Epitope A (residues 383-387, 389-394, 396, 399, 400, 401,
403) presents three point mutations at residues 385, 399, and 403. The
calculated value is $p_{\text{epitope}} = 3/16 = 0.188$. Epitope B (residues
197-200, 221, 222) presents two point mutations at residues 197 and 217
(although residue 217 is not within the epitope, its mutation will affect the
epitope due to physical proximity). The calculated value is
$p_{\text{epitope}} = 2/6 = 0.33$. Epitope C (residues 328-332, 334, 336, 338,
339, 341-344, 346, 347, 357-359, 366-370) presents one point mutation at residue
370. The calculated value is $p_{\text{epitope}} = 1/23 = 0.043$. We conclude
that epitope B is likely dominant, and epitope A is likely subdominant,
for the A/Panama/2007/99 NA protein, whereas epitope C is cryptic.
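The dominance heuristic and the average over the dominant HA and NA epitopes can be sketched as follows. The function name `dominant_epitope` and the dictionary layout are hypothetical; the mutation counts and epitope sizes are those quoted in the Results, with the NA epitope B count including residue 217 as explained in the text. HA epitope D, which shows no mutations, is left out.

```python
# (mutations, epitope size) pairs as quoted in the Results section.
HA_COUNTS = {"A": (1, 19), "B": (3, 21), "C": (1, 27), "E": (2, 22)}
NA_COUNTS = {"A": (3, 16), "B": (2, 6), "C": (1, 23)}


def dominant_epitope(counts):
    """Return (name, p_epitope) for the epitope showing the largest drift.

    Picking the most-drifted epitope as dominant is the heuristic used in
    the text when no direct dominance measurement is available.
    """
    drift = {name: n_mut / size for name, (n_mut, size) in counts.items()}
    name = max(drift, key=drift.get)
    return name, drift[name]


ha_name, p_ha = dominant_epitope(HA_COUNTS)
na_name, p_na = dominant_epitope(NA_COUNTS)

# Average drift over the dominant HA and NA epitopes.
p_avg = 0.5 * (p_ha + p_na)

print(f"dominant HA epitope {ha_name}: p = {p_ha:.3f}")
print(f"dominant NA epitope {na_name}: p = {p_na:.3f}")
print(f"p_avg = {p_avg:.3f}")
```

With these counts the average is 5/21, or about 0.238, which sits just above the lower bound of 0.23 quoted in the Methods for the original antigenic sin window.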

4 Discussion 

In Figure 1 are shown the predicted immune responses to the A/Fujian/411/2002 
strain for the hemagglutinin (green) and neuraminidase (red) dominant epi- 
topes after vaccination to A/Panama/2007/99. The predicted values for hemag- 
glutinin lie in the region of moderate immune response, and so consistent with 
the WHO data [2] one would expect some degree of cross-strain protection. 
However, the predicted immune response to the dominant neuraminidase epitope
is in the region of original antigenic sin. In the design of the 2003-2004
influenza vaccine, neither the cross activity nor the immune response was
measured for the A/Fujian/411/2002 NA protein in response to A/Panama/2007/99
NA vaccination [2]. Upon analysis of the actual effectiveness of the 2003-2004
vaccine, there does appear to have been a modest degree of original antigenic
sin, or negative vaccine effectiveness [1].

Summarizing our findings, it would appear that for the hemagglutinin protein
of A/Panama/2007/99, epitope B is dominant, and vaccination gives modest
protection to the A/Fujian/411/2002 strain. For the neuraminidase protein
of A/Panama/2007/99, it would appear that epitope B is also dominant, and
vaccination may increase the susceptibility to the A/Fujian/411/2002 strain.
Taken in aggregate, these results (Figure 1) suggest that the 2003-2004 flu
vaccine would have essentially no effect against the circulating
A/Fujian/411/2002 strain, in agreement with clinical findings [1] and in
disagreement with early expectations [2].



In conclusion, we suggest that strains related by antigenic drift be compared 
by measuring differences in the epitope regions of the hemagglutinin and
neuraminidase proteins and not by differences in the whole sequence or phylogeny
as is presently done. This particular point is supported by the correlations of 
seasonal flu severity with epitope antigenic drift shown in Figure 2. Thus, there 
is a need for a detailed characterization of the epitope regions in the differ- 
ent influenza strains. In particular, a precise determination of which epitopes 
are dominant in the proposed vaccine strain and an experimental measure of 
$p_{\text{epitope}}$ and of cross activity between the proposed vaccine strain
and the circulating strains would be highly productive. In the absence of this
determination, we suggest estimating the dominant epitope as that which shows
the most antigenic drift [22,23,24,25], as we do in the present work. From
either epitope sequence drift or cross activity, Figure 1 can be used to
estimate the degree
of the immune response, which is a non-linear and non-monotonic function of 
the measured data. We believe that this quantitative epitope analysis should 
be incorporated as part of the regular protocol for construction of the annual 
influenza vaccine. 
Figures

Fig. 1. The evolved affinity constant to a second antigen after exposure to an
original antigen whose epitope region differs by probability
$p_{\text{epitope}}$ (solid line). The dashed line represents the affinity
constant without previous exposure. In green are shown the responses at the
values of the differences between the A/Panama/2007/99 (vaccine) and
A/Fujian/411/2002 (circulating) strains for the B and E hemagglutinin
epitopes. In red are shown the responses at the values of the difference between
the A/Panama/2007/99 (vaccine) and A/Fujian/411/2002 (circulating) strains for
the A and B neuraminidase epitopes. Dominant epitopes are shown in bold. The
clinical outcome is an average of the response to the HA and NA proteins, and in
purple is shown the immune response at the average difference for the dominant
HA and NA epitopes (see text). The effectiveness of the 2003-2004 flu vaccine
was marginal at best. In the inset is shown the cross affinity of the memory
sequences for the mutated antigen, often measured biochemically and distinct
from the evolved immune response. As is often found, the cross activity
decreases exponentially with antigenic drift [32].

Fig. 2. a) Correlation between influenza seasonal severity (see text) and
hemagglutinin antigenic drift, calculated by epitope analysis. Least-squares
regression analysis yields the linear fit
$y = 1.5425 + 5.9162\,p_{\text{epitope}}$, with a correlation coefficient
R = 0.54432. b) Correlation between influenza seasonal severity (see text) and
hemagglutinin antigenic drift, calculated by whole sequence analysis.
Least-squares regression analysis yields the linear fit
$y = 2.0041 + 11.684\,p_{\text{sequence}}$, with a correlation coefficient
R = 0.2183.
Fig. 3. a) Shown are the dominant B (top) and subdominant E (middle) epitopes
of the hemagglutinin protein in space-filling format [19]. b) Shown are the
dominant B (right) and subdominant A (left) epitopes of the neuraminidase
protein in space-filling format [30]. The rest of the proteins are shown in
ribbon format.



Figure 1. Munoz and Deem, "Epitope analysis . . ." (plot; horizontal axis: $p_{\text{epitope}}$, ticks at 0.2, 0.4, 0.6, 0.8, 1)
Figure 2. Munoz and Deem, "Epitope analysis . . ."
Figure 3. Munoz and Deem, "Epitope analysis . . ."